Great Duck Island Storm Petrels

Leach’s Storm Petrel

Leach’s Storm Petrel

Data source: Gnam and Sheridan 2023.

Study Site

Methods

Methods

Data

Variables of Interest

  • Side of island

Variables of Interest

  • Side of island
  • Distance to Ocean

Histogram

Histogram (nonzero counts)

In case you were curious…

linearModel <- lm(Total~Dist+Side, data = all)
plot(linearModel)

Are the data overdispersed?

  • Poisson distribution assumes:

\[ {\frac{variance}{mean}} \approx 1 \]

  • Data (with 0s):
var(all$Total)/mean(all$Total)
[1] 9.869822

Zero-inflated GLM

  • Two different models

Can poisson error be met without 0s?

  • Data (without 0s):
var(nonzero$Total)/mean(nonzero$Total)
[1] 6.423672
  • Data (without 0s, log):
var(nonzero$Ln)/mean(nonzero$Ln)
[1] 0.650921
var(nonzero$Log10)/mean(nonzero$Log10)
[1] 0.2826914

Verbal Model - Poisson ANVOCA

  • Does distance to the coast and side of the island impact petrel burrow counts?
Model Variables
Source DF Fctr/Cov R/F
Side 1 Fctr fixed
Dist 1 Cov fixed
SidexDist 1x1 ff

Response variable: count of burrows in quadrat

ANOVA

anova <- aov(Total~Dist+Side+Dist*Side, data = all)
summary(anova)
             Df Sum Sq Mean Sq F value Pr(>F)    
Dist          1     13    13.2   0.488  0.485    
Side          1   2941  2940.9 108.267 <2e-16 ***
Dist:Side     1      2     1.6   0.059  0.809    
Residuals   604  16407    27.2                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Poisson ANCOVA for non-zero data

\[ Total = e^\eta + \varepsilon \] \[ \eta = \beta_0 + \beta_dX_d +\beta_sX_s + \beta_{d x s}X_dX_s \]

Where: d = distance to ocean s = East/West side of island Poisson error distribution

Poisson ANCOVA model in R

poissonModel <- glm(Total~Dist+Side+Dist*Side, data = nonzero, family = poisson(link = log))
plot(poissonModel)

Poisson ANCOVA model in R


Call:
glm(formula = Total ~ Dist + Side + Dist * Side, family = poisson(link = log), 
    data = nonzero)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.729294   0.119517  14.469  < 2e-16 ***
Dist        -0.004789   0.001156  -4.143 3.43e-05 ***
SideW        0.483828   0.129603   3.733 0.000189 ***
Dist:SideW   0.002607   0.001273   2.048 0.040546 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 1565.0  on 293  degrees of freedom
Residual deviance: 1387.1  on 290  degrees of freedom
AIC: 2380.2

Number of Fisher Scoring iterations: 5

Maybe a Gamma distribution?

gammaModel <- glm(Total~Dist+Side+Dist*Side, data = nonzero, family = Gamma())
plot(gammaModel)

Maybe a Gamma distribution?


Call:
glm(formula = Total ~ Dist + Side + Dist * Side, family = Gamma(), 
    data = nonzero)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.1638703  0.0479744   3.416 0.000727 ***
Dist         0.0013250  0.0005319   2.491 0.013296 *  
SideW       -0.0559233  0.0503825  -1.110 0.267930    
Dist:SideW  -0.0010386  0.0005586  -1.859 0.063996 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Gamma family taken to be 0.7872139)

    Null deviance: 262.46  on 293  degrees of freedom
Residual deviance: 230.06  on 290  degrees of freedom
AIC: 1664.4

Number of Fisher Scoring iterations: 6

Will proceed with poisson.

Verbal Model - Neg. Binomial GzLM

  • Does distance to the coast and side of the island impact presence of petrel burrows?
Model Variables
Source DF Fctr/Cov R/F
Side 1 Fctr fixed
Dist 1 Cov fixed
SidexDist 1x1 ff

Response variable: presence of burrows in quadrat

ANOVA Table

anova <- aov(Present~Dist+Side+Dist*Side, data = all)
summary(anova)
             Df Sum Sq Mean Sq F value Pr(>F)    
Dist          1   1.23    1.23   6.302 0.0123 *  
Side          1  32.46   32.46 166.078 <2e-16 ***
Dist:Side     1   0.10    0.10   0.516 0.4730    
Residuals   604 118.05    0.20                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Negative Binomial GLM

\[ Odds = e^\eta \]

\[ \eta = \beta_0 + \beta_dX_d +\beta_sX_s + \beta_{d x s}X_dX_s \]

Where: d = distance to ocean s = East/West side of island Error distribution is binomial/ Bernoulli

Neg. Binoimial GLM in R

presentabsent <- glm(Present~Dist+Side+Dist*Side, family = binomial(link = logit), data = all)
plot(presentabsent)

Neg. Binoimial GLM in R

library(statmod)
res <- qresid(presentabsent)
qqnorm(res)
qqline(res)

Neg. Binoimial GLM in R

summary(presentabsent)

Call:
glm(formula = Present ~ Dist + Side + Dist * Side, family = binomial(link = logit), 
    data = all)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -1.635830   0.278333  -5.877 4.17e-09 ***
Dist         0.005048   0.002443   2.067   0.0388 *  
SideW        1.918985   0.357368   5.370 7.88e-08 ***
Dist:SideW   0.001773   0.003495   0.507   0.6120    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 842.21  on 607  degrees of freedom
Residual deviance: 699.34  on 604  degrees of freedom
AIC: 707.34

Number of Fisher Scoring iterations: 4

The Zero Inflated GLzM

library(pscl)
zeroInflated <- zeroinfl(Total ~ Dist+Side+Dist*Side |
                 Dist+Side+Dist*Side,
               dist = 'poisson',
               data = all)
summary(zeroInflated)

Call:
zeroinfl(formula = Total ~ Dist + Side + Dist * Side | Dist + Side + 
    Dist * Side, data = all, dist = "poisson")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.5947 -0.8068 -0.4596  0.1611  8.0925 

Count model coefficients (poisson with log link):
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.749518   0.123315  14.187  < 2e-16 ***
Dist        -0.005510   0.001254  -4.394 1.11e-05 ***
SideW        0.464602   0.133155   3.489 0.000484 ***
Dist:SideW   0.003306   0.001363   2.425 0.015309 *  

Zero-inflation model coefficients (binomial with logit link):
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.6771608  0.2841251   5.903 3.57e-09 ***
Dist        -0.0062192  0.0025813  -2.409    0.016 *  
SideW       -1.9568037  0.3621482  -5.403 6.54e-08 ***
Dist:SideW  -0.0006878  0.0036027  -0.191    0.849    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 12 
Log-likelihood: -1532 on 8 Df
E2 <- resid(zeroInflated, type = "pearson")
N  <- nrow(all)
p  <- length(coef(zeroInflated)) + 1 # '+1' is due to theta
sum(E2^2) / (N - p)
[1] 2.058631

Likelihood Ratio for Side

  • *Excluding interaction term, since the coefficient is very small.
withoutSide <-  zeroinfl(Total ~ Dist | Dist, dist = 'poisson', data = all)
withSide <-  zeroinfl(Total ~ Dist + Side| Dist + Side, dist = 'poisson', data = all)

library(lmtest)
lrtest (withoutSide, withSide)
Likelihood ratio test

Model 1: Total ~ Dist | Dist
Model 2: Total ~ Dist + Side | Dist + Side
  #Df  LogLik Df  Chisq Pr(>Chisq)    
1   4 -1669.9                         
2   6 -1535.4  2 269.01  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

LR for Distance from Ocean

withoutSide <-  zeroinfl(Total ~ Side | Side, dist = 'poisson', data = all)
withSide <-  zeroinfl(Total ~ Dist + Side| Dist + Side, dist = 'poisson', data = all)

library(lmtest)
lrtest (withoutSide, withSide)
Likelihood ratio test

Model 1: Total ~ Side | Side
Model 2: Total ~ Dist + Side | Dist + Side
  #Df  LogLik Df  Chisq Pr(>Chisq)    
1   4 -1556.8                         
2   6 -1535.4  2 42.972  4.663e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Is dist ecologically meaningful?

What about North/ South?

  • Same equation, same error structure
N_S <-  zeroinfl(Total ~ NS | NS, dist = 'poisson', data = all)
summary(N_S)
E_W <-  zeroinfl(Total ~ Side| Side, dist = 'poisson', data = all)
summary(E_W)

Call:
zeroinfl(formula = Total ~ NS | NS, data = all, dist = "poisson")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-0.9579 -0.7439 -0.7439  0.2244  8.6921 

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.80058    0.03065  58.750  < 2e-16 ***
NSS          0.23093    0.04554   5.071 3.95e-07 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -0.2313     0.1129  -2.049 0.040439 *  
NSS           0.6244     0.1649   3.787 0.000152 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 7 
Log-likelihood: -1676 on 4 Df

Call:
zeroinfl(formula = Total ~ Side | Side, data = all, dist = "poisson")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.2605 -0.7826 -0.4860  0.1732  7.5806 

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.24436    0.06848   18.17   <2e-16 ***
SideW        0.78317    0.07260   10.79   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.1196     0.1418   7.897 2.86e-15 ***
SideW        -1.9410     0.1858 -10.444  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 7 
Log-likelihood: -1557 on 4 Df

Both E/W and N/S:


Call:
zeroinfl(formula = Total ~ NS + Side | NS + Side, data = all, dist = "poisson")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.2691 -0.7362 -0.4784  0.1481  8.4415 

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.07057    0.07375   14.52  < 2e-16 ***
NSS          0.31056    0.04601    6.75 1.48e-11 ***
SideW        0.83449    0.07288   11.45  < 2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.0042     0.1829   5.492 3.98e-08 ***
NSS           0.1868     0.1894   0.987    0.324    
SideW        -1.8936     0.1917  -9.878  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 9 
Log-likelihood: -1534 on 6 Df

Bonus: Occupancy Rate

Occupancy Rate

Do the number of active burrows scale with the number of total burrows?

Model?

occupancy <- glm(Active~Total, family = gaussian(link = identity), data = nonzero)
plot(occupancy)

Coefficients


Call:
glm(formula = Active ~ Total, family = gaussian(link = identity), 
    data = nonzero)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.7882     0.1319  -5.977 6.62e-09 ***
Total         0.5968     0.0141  42.321  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 2.501663)

    Null deviance: 5211.16  on 293  degrees of freedom
Residual deviance:  730.49  on 292  degrees of freedom
AIC: 1107.9

Number of Fisher Scoring iterations: 2

Questions?